1. (Currently Amended) A system for creating an aggregated data model from 
a plurality of data distribution models, each data distribution model 
describing a data distribution having one or more data elements, each data 
element having a value, each data distribution model having one or more 
bins, each bin comprising a start point having a value, an end point having a 
value, a value indicating the number of data elements for each bin, and a 
polynomial formula associated with each bin, the polynomial formula 
approximating the data elements for the respective bin, said system 
comprising: 

a processor; and 

a computer program executable on said processor, the 
computer program adapted to perform the following steps: 

(a) determining which start point has the minimum value and 
which end point has the maximum value of all of the bins of 
all of the data distribution models; 

(b) setting a start point of a first bin of the aggregated data model 
to said start point determined to have the minimum value; 

(c) setting an end point of a last bin of the aggregated data model 
to said end point determined to have the maximum value; 

(d) determining a total number of a plurality of points for the 
aggregated data model by adding the values indicating the 
number of data elements from all bins from all data 
distribution models; 

(e) approximating the data elements in the data distribution 
described by each data distribution model using the start 
point, polynomial formula, and number of data elements for 
each bin in each respective data distribution model, each 
approximated data element comprising one of said points in 
the aggregated data model; 

(f) sorting the points from minimum to maximum; 

(g) distributing the points into one or more bins in the aggregated 
data model such that a substantially equal number of points 
are in each bin of the aggregated data model; and 

(h) determining a polynomial formula with the sorted data 
elements for each bin of the aggregated data model. 
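Steps (a) through (h) above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Bin` class, the helper names, and the convention that each bin's polynomial maps a rank fraction t = j/count to an offset from the bin's start point are all hypothetical, since the claim does not fix how the polynomial is parameterized. Step (g) is simplified to give any leftover points to the earliest bins, and step (h) uses a degree-1 least-squares fit.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    start: float
    end: float
    count: int
    coeffs: tuple  # polynomial coefficients, lowest degree first (assumed layout)


def approximate(b):
    """Step (e): regenerate b.count points from a bin.

    Assumption: value_j = start + p(j / count) for j = 1..count.
    """
    points = []
    for j in range(1, b.count + 1):
        t = j / b.count
        points.append(b.start + sum(c * t ** i for i, c in enumerate(b.coeffs)))
    return points


def fit_line(ts, ys):
    """Step (h), simplified: least-squares line through (t, y) pairs."""
    m = len(ts)
    if m == 1:
        return (ys[0], 0.0)
    mean_t, mean_y = sum(ts) / m, sum(ys) / m
    stt = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / stt
    return (mean_y - slope * mean_t, slope)


def aggregate(models, num_bins):
    """Combine several models (each a list of Bin) into one list of Bin."""
    all_bins = [b for model in models for b in model]
    lo = min(b.start for b in all_bins)             # steps (a), (b)
    hi = max(b.end for b in all_bins)               # steps (a), (c)
    total = sum(b.count for b in all_bins)          # step (d)
    if not 1 <= num_bins <= total:
        raise ValueError("need at least one point per aggregated bin")
    points = sorted(p for b in all_bins for p in approximate(b))  # (e), (f)

    base, extra = divmod(total, num_bins)
    result, pos = [], 0
    for i in range(num_bins):                       # step (g), simplified
        size = base + (1 if i < extra else 0)
        chunk = points[pos:pos + size]
        pos += size
        start = lo if i == 0 else chunk[0]
        end = hi if i == num_bins - 1 else chunk[-1]
        ts = [(j + 1) / size for j in range(size)]
        coeffs = fit_line(ts, [p - start for p in chunk])
        result.append(Bin(start, end, size, coeffs))
    return result
```

For example, `aggregate([model_a, model_b], 2)` returns two bins whose counts sum to the total number of data elements in both input models.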

2. (Original) The system of claim 1, wherein the computer program is further 
for determining the end point for each bin in the aggregated data model. 

3. (Currently Amended) The system of claim 1, wherein the computer 
program is adapted to perform the step of distributing the points into the 
one or more bins of the aggregated data model according to the following 
formula: 

(g)(1) if the number of points in the aggregated data model is 
equally divisible into the number of bins, the end point of the 
first bin is equal to the value of the ith point in the aggregated 
data model, wherein i is the number of points in each bin 
determined by dividing the points equally into the number of 
bins, wherein the value of the end point of each bin is equal to 
the value of the ith point after the last point in the preceding 
bin, wherein the start point of each bin is equal to the point 
after the last point of the previous bin, else 

(g)(2) if the number of points is not 
equally divisible by the number of bins, then the number of 
points in each bin is determined by dividing the number of 
points by the number of bins, and then adding one to the 
count of the points in each of a number of bins equal to the 
remainder after dividing, wherein the bins that have one 
added to the count are determined according to the following 
formula: 

for k from 1 to r 

binadd = INT((n*k)/(r+1)) 

next k 

wherein binadd is the sequential bin number to add one to the 
count of points to include therein, n is the total number of 
bins in the aggregated data model, r is the remainder from 
dividing the number of points in the data distribution by the 
number of bins, and INT is a function for rounding the result 
of the bracketed formula to produce an integer result. 
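The binadd loop above can be illustrated with a short sketch. It assumes INT truncates toward zero (the claim only says it rounds to an integer) and that bins are numbered from 1; the function name `bin_sizes` is hypothetical.

```python
def bin_sizes(num_points, num_bins):
    """Return the number of points assigned to each bin.

    Every bin gets the integer quotient; the r leftover points are
    spread over the bins numbered INT((n*k)/(r+1)) for k = 1..r.
    """
    base, r = divmod(num_points, num_bins)
    sizes = [base] * num_bins
    for k in range(1, r + 1):
        binadd = (num_bins * k) // (r + 1)  # INT taken as truncation (assumption)
        sizes[binadd - 1] += 1              # binadd is a 1-based bin number
    return sizes


print(bin_sizes(10, 4))  # [3, 3, 2, 2]
print(bin_sizes(11, 5))  # [2, 3, 2, 2, 2]
```

Because r is always smaller than n, each binadd value falls between 1 and n-1 and no bin is selected twice, so the sizes differ by at most one.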

4. (Original) The system of claim 1, wherein the computer program is for 
performing separately for each bin of the aggregated data model, the steps 
of approximating the data elements for each bin, determining the end point 
for each bin, and determining the polynomial formula for each bin. 

5. (Currently Amended) The system of claim 1, wherein each data distribution 
model is the result of the computer program performing the following 
steps: 

(A) sorting the data elements in each data distribution from 
minimum to maximum; 

(B) computing the number of data elements in each data 
distribution; 

(C) determining the value of the start point and the value of the 
end point of each bin by dividing the data elements into a 
plurality of substantially equal sized bins for each data 
distribution; 

(D) counting the number of data elements in each bin for each 
data distribution; and 

(E) computing each distribution model for each data distribution, 
each distribution model comprising, for each bin, the start 
point of the bin, the end point of the bin, and the number of 
data elements in the bin. 
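Steps (A) through (E) can be sketched as below for a single data distribution. This is an illustrative reading only: the function name `build_model` and the dictionary layout are hypothetical, leftover elements are simply given to the earliest bins, and the polynomial fit for each bin is omitted.

```python
def build_model(data, num_bins):
    """Build a list of bins, each with a start point, end point, and count."""
    values = sorted(data)                       # step (A)
    n = len(values)                             # step (B)
    if not 1 <= num_bins <= n:
        raise ValueError("need at least one data element per bin")
    base, extra = divmod(n, num_bins)
    model, pos = [], 0
    for i in range(num_bins):
        size = base + (1 if i < extra else 0)   # step (C), simplified split
        chunk = values[pos:pos + size]
        pos += size
        model.append({
            "start": chunk[0],                  # first element in the bin
            "end": chunk[-1],                   # last element in the bin
            "count": len(chunk),                # step (D)
        })
    return model                                # step (E)


print(build_model([5, 1, 4, 2, 3, 6], 3))
```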

6. (Currently Amended) The system of claim 5, wherein the computer 
program is adapted to perform the following steps for determining the start 
points and end points of the bins for each data distribution model: 

(C)(1) selecting as the start point of the first bin the value of the data 
element having the minimum value in the sorted data 
distribution; 

determining the start point and end point of each bin according to the 
following criteria: 

(C)(2) if the number of data elements in the data distribution is 
equally divisible into the number of bins, the end point of the 
first bin is equal to the value of the ith data element in the data 
distribution, wherein i is the number of data elements in each 
bin determined by dividing the data elements equally into the 
number of bins, wherein the value of the end point of each bin 
following the first bin is equal to the value of the ith data 
element after the last data element in the preceding bin, 
wherein the start point of each bin is equal to the data element 
after the last data element of the previous bin, and 
(C)(3) if the number of data elements in the data distribution is 
not equally divisible by the number of bins, then the number 
of data elements in each bin is determined by dividing the 
number of data elements by the number of bins, and then 
adding one to the count of the data elements in each of a 
number of bins equal to the remainder after dividing, wherein 
the bins that have one added to the count are determined 
according to the following formula: 
for k from 1 to r 

binadd = INT((n*k)/(r+1)) 

next k 

wherein binadd is the sequential bin number to add one to the 
count of data elements to include therein, n is the total 
number of bins in the data distribution model, r is the 
remainder from dividing the number of data elements in the 
data distribution by the number of bins, and INT is a function 
for rounding the result of the bracketed formula to produce an 
integer result. 

7. (Original) The system of claim 6, wherein the computer program is further 
for performing the step of counting by counting, for each bin, each data 
element satisfying the following formula: 

start point < element value <= end point 

wherein start point is the start point of the respective bin, element 
value is the value of each data element in each bin, and end point is the end 
point of the respective bin. 
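The counting rule in claim 7 can be written as a one-line filter. The function name `count_in_bin` is hypothetical. Read literally, the inequality is strict on the left, so an element equal to the bin's start point is not counted for that bin.

```python
def count_in_bin(values, start, end):
    """Count elements with start < value <= end (half-open on the left)."""
    return sum(1 for v in values if start < v <= end)


print(count_in_bin([1, 2, 3, 4, 5], 1, 3))  # 2 (the values 2 and 3)
```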

8. (Original) The system of claim 7, comprising a storage medium for storing 
each data distribution model by storing, for each bin, the start point, the end 
point, the number of data elements, and the parameters of the polynomial 
formula. 

9. (Original) The system of claim 1, wherein the computer program is further 
for performing one or more statistical analyses using the aggregated data 
model. 

10. (Original) The system of claim 9, wherein the statistical analysis performed 
comprises determining the range of the points of the aggregated data model 
analyzed by subtracting the end point of the last bin in the aggregated data 
model from the start point of the first bin in the aggregated data model. 
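As a small illustration, the range can be read directly from the first and last bins without touching the underlying points. The sketch below assumes the range is reported as a non-negative spread, that is, the last end point minus the first start point; subtracting in the order the claim states would give the negative of this value. The function name and the (start, end) tuple layout are hypothetical.

```python
def model_range(bins):
    """Range of a model given its bins as (start, end) pairs in order."""
    first_start = bins[0][0]
    last_end = bins[-1][1]
    return last_end - first_start  # assumed sign convention: non-negative spread


print(model_range([(0.0, 8.0), (9.0, 15.0)]))  # 15.0
```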

11. (Original) The system of claim 9, wherein the statistical analysis performed 
comprises determining the inter quantile range of the points of the 
aggregated data model. 

12. (Original) The system of claim 9, wherein the statistical analysis performed 
comprises determining the median value of the aggregated data model by 
determining a number j computed by dividing the number of bins by 2, and 
then reading the value of the end point of the jth bin as the median value if 
the number of bins in the aggregated data model is equally divisible by 2 or 
by reading the interpolated value using the polynomial function of the mid 
point of the jth bin if the number of bins in the aggregated data model is not 
equally divisible by 2. 

NCR Docket No. 10995 

13. (Canceled) 

14. (Canceled) 

15. (Canceled) 

16. (Canceled) 

17. (Canceled) 

18. (Canceled) 

19. (Canceled) 

20. (Canceled) 

21. (Canceled) 

22. (Canceled) 

23. (Canceled) 

24. (Canceled) 



